38 research outputs found

    On the `Semantics' of Differential Privacy: A Bayesian Formulation

    Full text link
    Differential privacy is a definition of "privacy'" for algorithms that analyze and publish information about statistical databases. It is often claimed that differential privacy provides guarantees against adversaries with arbitrary side information. In this paper, we provide a precise formulation of these guarantees in terms of the inferences drawn by a Bayesian adversary. We show that this formulation is satisfied by both "vanilla" differential privacy as well as a relaxation known as (epsilon,delta)-differential privacy. Our formulation follows the ideas originally due to Dwork and McSherry [Dwork 2006]. This paper is, to our knowledge, the first place such a formulation appears explicitly. The analysis of the relaxed definition is new to this paper, and provides some concrete guidance for setting parameters when using (epsilon,delta)-differential privacy.Comment: Older version of this paper was titled: "A Note on Differential Privacy: Defining Resistance to Arbitrary Side Information

    Spanners for Geometric Intersection Graphs

    Full text link
    Efficient algorithms are presented for constructing spanners in geometric intersection graphs. For a unit ball graph in R^k, a (1+\epsilon)-spanner is obtained using efficient partitioning of the space into hypercubes and solving bichromatic closest pair problems. The spanner construction has almost equivalent complexity to the construction of Euclidean minimum spanning trees. The results are extended to arbitrary ball graphs with a sub-quadratic running time. For unit ball graphs, the spanners have a small separator decomposition which can be used to obtain efficient algorithms for approximating proximity problems like diameter and distance queries. The results on compressed quadtrees, geometric graph separators, and diameter approximation might be of independent interest.Comment: 16 pages, 5 figures, Late

    Spectral Norm of Random Kernel Matrices with Applications to Privacy

    Get PDF
    Kernel methods are an extremely popular set of techniques used for many important machine learning and data analysis applications. In addition to having good practical performances, these methods are supported by a well-developed theory. Kernel methods use an implicit mapping of the input data into a high dimensional feature space defined by a kernel function, i.e., a function returning the inner product between the images of two data points in the feature space. Central to any kernel method is the kernel matrix, which is built by evaluating the kernel function on a given sample dataset. In this paper, we initiate the study of non-asymptotic spectral theory of random kernel matrices. These are n x n random matrices whose (i,j)th entry is obtained by evaluating the kernel function on xix_i and xjx_j, where x1,...,xnx_1,...,x_n are a set of n independent random high-dimensional vectors. Our main contribution is to obtain tight upper bounds on the spectral norm (largest eigenvalue) of random kernel matrices constructed by commonly used kernel functions based on polynomials and Gaussian radial basis. As an application of these results, we provide lower bounds on the distortion needed for releasing the coefficients of kernel ridge regression under attribute privacy, a general privacy notion which captures a large class of privacy definitions. Kernel ridge regression is standard method for performing non-parametric regression that regularly outperforms traditional regression approaches in various domains. Our privacy distortion lower bounds are the first for any kernel technique, and our analysis assumes realistic scenarios for the input, unlike all previous lower bounds for other release problems which only hold under very restrictive input settings.Comment: 16 pages, 1 Figur

    Debiasing Conditional Stochastic Optimization

    Full text link
    In this paper, we study the conditional stochastic optimization (CSO) problem which covers a variety of applications including portfolio selection, reinforcement learning, robust learning, causal inference, etc. The sample-averaged gradient of the CSO objective is biased due to its nested structure and therefore requires a high sample complexity to reach convergence. We introduce a general stochastic extrapolation technique that effectively reduces the bias. We show that for nonconvex smooth objectives, combining this extrapolation with variance reduction techniques can achieve a significantly better sample complexity than existing bounds. We also develop new algorithms for the finite-sum variant of CSO that also significantly improve upon existing results. Finally, we believe that our debiasing technique could be an interesting tool applicable to other stochastic optimization problems too
    corecore